A comparison of acoustic coding models for speech-driven facial animation
Abstract
This article presents a thorough experimental comparison of several acoustic modeling techniques by their ability to capture information related to orofacial motion. These models include (1) Linear Predictive Coding and Linear Spectral Frequencies, which model the dynamics of the speech production system, (2) Mel Frequency Cepstral Coefficients and Perceptual Critical Feature Bands, which encode perceptual cues of speech, (3) spectral energy and fundamental frequency, which capture prosodic aspects, and (4) two hybrid methods that combine information from the previous models. We also consider a novel supervised procedure based on Fisher's Linear Discriminants to project acoustic information onto a low-dimensional subspace that best discriminates different orofacial configurations. Prediction of orofacial motion from speech acoustics is performed using a non-parametric k-nearest-neighbors procedure. The sensitivity of this audio-visual mapping to coarticulation effects and spatial locality is thoroughly investigated. Our results indicate that the hybrid use of articulatory, perceptual and prosodic features of speech, combined with a supervised dimensionality-reduction procedure, is able to outperform any individual acoustic model for speech-driven facial animation. These results are validated on the 450 sentences of the TIMIT compact dataset. © 2005 Elsevier B.V. All rights reserved.
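As a rough illustration of the non-parametric k-nearest-neighbors mapping mentioned in the abstract, the sketch below predicts a visual (orofacial) vector for a query acoustic vector by averaging the visual vectors of the k closest training frames. It is only a minimal sketch under stated assumptions: the function name `knn_predict`, the Euclidean distance, the plain averaging, and the toy data are all hypothetical, and the paper's actual feature extraction, Fisher projection, and handling of coarticulation are not reproduced here.

```python
import math


def knn_predict(query, train_audio, train_visual, k=3):
    """Average the visual vectors of the k training frames closest to `query`.

    Assumption: distance is plain Euclidean distance in the acoustic feature
    space; the abstract does not say which metric the authors used.
    """
    if not 1 <= k <= len(train_audio):
        raise ValueError("k must be between 1 and the number of training frames")
    # Rank training frames by distance to the query acoustic vector.
    order = sorted(range(len(train_audio)),
                   key=lambda i: math.dist(query, train_audio[i]))
    nearest = order[:k]
    dims = len(train_visual[0])
    # Average each visual dimension over the k nearest frames.
    return [sum(train_visual[i][d] for i in nearest) / k for d in range(dims)]


# Toy data (made up for illustration): 2-D acoustic vectors paired with a
# 1-D "mouth opening" value.
train_audio = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [1.1, 0.9]]
train_visual = [[0.0], [0.2], [1.0], [0.8]]

closed = knn_predict([0.05, 0.0], train_audio, train_visual, k=2)
opened = knn_predict([1.0, 0.95], train_audio, train_visual, k=2)
```

In this toy setup a query near the first two frames yields a small opening value and a query near the last two yields a large one. A supervised projection such as Fisher's Linear Discriminants would, under the same assumptions, be applied to the acoustic vectors before the distance computation.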
Similar resources
Speech Recognition with Hidden Markov Models in Visual Communication
Speech is produced by the vibration of the vocal cords and the configuration of the articulators. Because some of these articulators are visible, there is an inherent relationship between the acoustic and the visual forms of speech. This relationship has been historically used in lipreading. Today's advanced computer technology opens up new possibilities to exploit the correlation between acou...
Natural head motion synthesis driven by acoustic prosodic features
Natural head motion is important to realistic facial animation and engaging human-computer interactions. In this paper, we present a novel data-driven approach to synthesize appropriate head motion by sampling from trained Hidden Markov Models (HMMs). First, while an actress recited a corpus specifically designed to elicit various emotions, her 3D head motion was captured and further processed ...
Speech and Expression Driven Animation of a Video-Realistic Appearance Based Hierarchical Facial Model
We describe a new facial animation system based on a hierarchy of morphable sub-facial appearance models. The innovation in our approach is that through the hierarchical model, parametric control is available for the animation of multiple sub-facial areas. We animate these areas automatically both from speech to produce lip-synching, and natural pauses and hesitations and using specific tempora...
Carnival-combining speech technology and computer animation.
Speech is powerful information technology and the basis of human interaction. By emitting streams of buzzing, popping, and hissing noises from our mouths, we transmit thoughts, intentions, and knowledge of the world from one mind to another. We’re accustomed to thinking of speech as an acoustic, auditory phenomenon. However, speech is also visible. Although the primary function of speech is to ...
Speech-driven 3d facial animation for mobile entertainment
This paper presents an entertainment-oriented application for mobile service, which generates customized speech-driven 3D facial animation and delivers to end-user by MMS (Multimedia Messaging Service). Some important methods of this application are discussed, including the 3D facial model based on 3 photos, the 3D facial animation driven by speech or text on-line and the video format transform...
Journal title:
- Speech Communication
Volume 48
Publication date: 2006